class: center, middle, inverse, title-slide

.title[
# Three common mistakes in statistics and how to avoid them
]
.author[
### Elizabeth Pankratz
]
.institute[
### Department of Psychology
The University of Edinburgh
]

---

## Something you won't be able to unsee:

--

.pull-left[

Reeder et al. (2017) in **Journal of Memory and Language.**
]

.pull-right[

Elazar et al. (2022) in **Cognitive Science.**

<br>


Harrigan et al. (2022) in **Language.**
]

---

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

--

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as continuous numeric.
]

.pull-right[
]

--

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
]

--

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]

.pull-right[
]

---

### The data: Students' anxiety ratings for "Going to ask my statistics teacher for individual help with material I am having difficulty understanding."

<img src="data:image/png;base64,#demo_files/figure-html/bar-aggregated-1.png" width="576" style="display: block; margin: auto;" />

---

### The data: Students' anxiety ratings for "Going to ask my statistics teacher for individual help with material I am having difficulty understanding."

.pull-left[

``` r
slice(anx, 45:50)
```

```
## # A tibble: 6 × 3
##   unique_id gender         rating
##   <chr>     <chr>           <dbl>
## 1 7d28c303  Female/Woman        4
## 2 7d55383a  Another Gender      4
## 3 8116550a  Female/Woman        1
## 4 83491ff9  Female/Woman        4
## 5 8450f8ad  Male/Man            2
## 6 876547d6  Female/Woman        3
```
]

--

.pull-right[
`rating` looks like numbers, and R treats it like numbers.

And we can manipulate it like numbers.

``` r
mean(anx$rating)
```

```
## [1] 2.868054
```
]

---

## Why Likert scale ratings are not continuous numeric

.center[

]

---
count:false

## Why Likert scale ratings are not continuous numeric

.center[

]

---
count:false

## Why Likert scale ratings are not continuous numeric

.center[

]

---
count:false

## Why Likert scale ratings are not continuous numeric

.center[

]

---

## Remember: We are smarter than R is

Store categorical variables as factors.

``` r
anx <- anx |>
  mutate(rating = factor(rating))
```

--

Now it's impossible to incorrectly treat them as if they're numeric!

``` r
mean(anx$rating)
```

```
## [1] NA
```

---

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]

.pull-right[
]

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
]

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists.
]

.pull-right[
]

---
count:false

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]

.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
]

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists.
]

.pull-right[
]

---

## Model ordinal data with `polr()`

polr = **P**roportional **O**dds **L**ogistic **R**egression

--

``` r
library(MASS)  # MASS contains the polr() function

anx_fit1 <- polr(
  rating ~ 1,        # intercept-only model, to start
  data = anx,
  Hess = TRUE,
  method = 'probit'  # ask me in the Q+A!
)
```

---

## Model ordinal data with `polr()`

``` r
summary(anx_fit1)
```

```
## Call:
## polr(formula = rating ~ 1, data = anx, Hess = TRUE, method = "probit")
## 
## No coefficients
## 
## Intercepts:
##     Value    Std. Error t value 
## 1|2  -0.8420   0.0157   -53.7268
## 2|3  -0.1678   0.0138   -12.1462
## 3|4   0.3833   0.0141    27.1512
## 4|5   1.0339   0.0168    61.6193
## 
## Residual Deviance: 26596.28 
## AIC: 26604.28
```

---

## What do those `Intercepts` mean?

--

<img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal-1.png" width="864" style="display: block; margin: auto;" />

???

- imagine that there's some underlying continuous normal distribution of anxiety, assumed standard normal [show normal distrib]
- ppl with high anxiety are more likely to give high responses, ppl with low anxiety more likely to give low responses (could do emojis relating to anxiety)
- so to estimate how different anxiety levels translate to different responses on the 1--5 scale, we draw thresholds on that distribution [add thresholds]
- ppl with anxiety in this bin will respond with 1, in this bin with 2, etc.
- and those thresholds, the cutpoints btwn ratings, are the intercepts.
- [show intercept estimates, put those same numbers on the thresholds]
- normal distribution assumption is from method = probit. other methods assume other underlying distributions, but the idea of thresholds is the same.

---
count: false

## What do those `Intercepts` mean?

<img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal2-1.png" width="864" style="display: block; margin: auto;" />

---
count: false

## What do those `Intercepts` mean?

<img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal3-1.png" width="864" style="display: block; margin: auto;" />

---

#### How does a student's gender affect how they respond to "Going to ask my statistics teacher for individual help with material I am having difficulty understanding"?

--

.pull-left[
<img src="data:image/png;base64,#demo_files/figure-html/plot-gender-bars-1.png" width="504" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="data:image/png;base64,#demo_files/figure-html/normals-stacked-onlyfem-1.png" width="504" style="display: block; margin: auto;" />
]

---

.center[
<img src="data:image/png;base64,#demo_files/figure-html/fem-normal-solo-1.png" width="936" style="display: block; margin: auto;" />
]

---
count: false

<img src="data:image/png;base64,#demo_files/figure-html/fem-mal-normals-1.png" width="936" style="display: block; margin: auto;" />

---
count: false

<img src="data:image/png;base64,#demo_files/figure-html/all-gender-normals-1.png" width="936" style="display: block; margin: auto;" />

---

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]

.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists.
]

.pull-right[
]

---
count: false

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]

.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists.
]

.pull-right[
]

---

## Are the effects of `gender` significant?

```
## Coefficients:
##                        Value Std. Error t value
## genderMale/Man       -0.3280    0.03015 -10.880
## genderAnother Gender  0.4846    0.11992   4.041
```

No *p*-values in the model summary.

--

But it's common practice to compare these *t*-values to a standard normal distribution, like *z*-scores.

--

<img src="data:image/png;base64,#demo_files/figure-html/zscore-mm-1.png" width="648" style="display: block; margin: auto;" />

<img src="data:image/png;base64,#demo_files/figure-html/zscore-ag-1.png" width="648" style="display: block; margin: auto;" />

???

Since both *p*-values are below 0.05:

- we CAN reject the null hypothesis that gender has no effect on ratings.
- **we CANNOT conclude that there really is an effect of gender.**

---

### Why don't significant *p*-values mean an effect exists?

Because we can also get significant *p*-values when there really is *no* effect.

--

.pull-left[
No difference in the true population:

<img src="data:image/png;base64,#demo_files/figure-html/true-skew-probdist-1.png" width="504" style="display: block; margin: auto;" />
]

--

.pull-right[
A possible random sample (*n* = 50 per group):

<img src="data:image/png;base64,#demo_files/figure-html/simdat-1.png" width="504" style="display: block; margin: auto;" />
]

---

### Why don't significant *p*-values mean an effect exists?

``` r
sim_fit <- polr(rating ~ group, data = simdat, method = 'probit', Hess = TRUE)
summary(sim_fit)
```

```
## Coefficients:
##                Value Std. Error t value
## groupGroup B -0.4479     0.2229  -2.009
```

<br>

--

<img src="data:image/png;base64,#demo_files/figure-html/zscore-sim-1.png" width="648" style="display: block; margin: auto;" />

<br>

So *p* is significant, but in the true population, Group A and Group B were identical!

---

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]

.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists.
]

.pull-right[
]

---
count:false

.pull-left[
## The mistake
]

.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]

.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]

.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists.
]

.pull-right[
Understand that significant *p*-values can arise even if no effect exists.
]

--

<br>

.center[**Thank you!
Time for questions!**]

---
count: false

## References

Elazar, A., Alhama, R. G., Bogaerts, L., Siegelman, N., Baus, C., & Frost, R. (2022). When the "tabula" is anything but "rasa": What determines performance in the auditory statistical learning task? *Cognitive Science*, 46(2), e13102.

Harrigan, K., Hogoboom, A., & Cochrane, L. (2022). Furthering student engagement: Lab sections in introductory linguistics. *Language*, 98(4), e199–e223.

Reeder, P. A., Newport, E. L., & Aslin, R. N. (2017). Distributional learning of subcategories in an artificial grammar: Category generalization and subcategory restrictions. *Journal of Memory and Language*, 97, 17–29.

Terry, J., Ross, R. M., Nagy, T., Salgado, M., Garrido-Vásquez, P., Sarfo, J. O., Cooper, S., Buttner, A. C., Lima, T. J. S., Öztürk, İ., Akay, N., Santos, F. H., Artemenko, C., Copping, L. T., Elsherif, M. M., Milovanović, I., Cribbie, R. A., Drushlyak, M. G., Swainston, K., … Field, A. P. (2023). Data from an International Multi-Centre Study of Statistics and Mathematics Anxieties and Related Variables in University Students (the SMARVUS Dataset). *Journal of Open Psychology Data*, 11(1), 8.

---
count: false

## Helpful resources

- Jamieson's (2004) paper _[Likert scales: How to (ab)use them](https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2929.2004.02012.x)_
- UCLA Statistical Methods and Data Analytics's web page _[Ordinal Logistic Regression](https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/)_
- A. Solomon Kurz's (2021) blog post _[Notes on the Bayesian cumulative probit](https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/)_
- Gelman & Hill's (2007) book _[Data Analysis Using Regression and Multilevel/Hierarchical Models](https://www.cambridge.org/highereducation/books/data-analysis-using-regression-and-multilevel-hierarchical-models/32A29531C7FD730C3A68951A17C9D983)_

---

<!-- GRAVEYARD --> <!-- ## But first: --> <!-- Go to `menti.com` and enter code `3459 5977`. 
--> <!-- Or scan QR code: --> <!--  --> <!-- --- --> <!-- class: inverse, middle, center --> <!--
--> <!-- # The data we'll use --> <!-- --- --> <!-- ## The SMARVUS dataset (Terry et al., 2023) --> <!-- .center[SMARVUS = **S**tatistics and **M**athematics **A**nxieties and **R**elated **V**ariables in **U**niversity **S**tudents] --> <!-- -- --> <!-- .pull-left[ --> <!-- A survey of *n* = 18,841 students (mostly Psychology UGs) from 35 countries. --> <!-- Students rated their anxiety from 1 (no anxiety) to 5 (a great deal of anxiety) in scenarios like: --> <!-- - Studying for a statistics test. --> <!-- - Interpreting the meaning of a table in a journal article. --> <!-- - **Going to ask my statistics teacher for individual help with material I am having difficulty understanding. → ** --> <!-- ] --> <!-- -- --> <!-- .pull-right[ --> <!-- ```{r read-in-anx, include=F} --> <!-- anx <- read_csv('data/anx.csv') --> <!-- anx <- select(anx, unique_id, gender, score) |> --> <!-- rename(rating = score) |> --> <!-- mutate(gender = factor(gender, levels = c('Female/Woman', 'Male/Man', 'Another Gender'))) --> <!-- ``` --> <!-- ```{r bar-aggregated, echo=F, fig.width=7, fig.height=5.5} --> <!-- anx |> --> <!-- ggplot(aes(x = factor(rating))) + --> <!-- geom_bar(fill = '#2e3836') + --> <!-- theme_classic() + --> <!-- theme(text = element_text(family = "Fira Sans", size = 24)) + --> <!-- labs( --> <!-- x = element_blank(), --> <!-- y = 'Count', --> <!-- caption = 'n = 8,314' --> <!-- ) + --> <!-- scale_x_discrete(labels = c('1\n(no anxiety)', '2', '3', '4', '5\n(a great deal\nof anxiety)')) + --> <!-- NULL --> <!-- ``` --> <!-- ] --> <!-- --- --> <!-- class: inverse, middle, center --> <!--
--> <!-- # Modelling an ordinal variable --> <!-- ### The .mono-white[polr()] express --> <!-- --- --> <!-- class: inverse, middle, center --> <!--
--> <!-- # Interpreting *p*-values --> <!-- --- --> <!-- class: inverse, middle, center --> <!--
--> <!-- # Stats anxiety and gender --> <!-- <br> <br> --> <!--
**First:** Think to yourself about the questions. --> <!--
**Then:** Ask your neighbour what they think. What's their reasoning? What's yours? --> <!--
**Afterward:** we'll look at the model's estimates together and discuss. --> <!-- ```{r anx_fit2-display, eval = FALSE} --> <!-- anx_fit2 <- polr( --> <!-- rating ~ gender, --> <!-- data = anx, --> <!-- method = 'probit', --> <!-- Hess = TRUE --> <!-- ) --> <!-- summary(anx_fit2) --> <!-- ``` --> <!-- ``` --> <!-- ## Coefficients: --> <!-- ## Value Std. Error t value --> <!-- ## genderMale/Man -0.3280 0.03015 -10.880 --> <!-- ## genderAnother Gender 0.4846 0.11992 4.041 --> <!-- ## --> <!-- ## Intercepts: --> <!-- ## Value Std. Error t value --> <!-- ## 1|2 -0.9045 0.0169 -53.5402 --> <!-- ## 2|3 -0.2246 0.0150 -14.9847 --> <!-- ## 3|4 0.3318 0.0151 21.9158 --> <!-- ## 4|5 0.9889 0.0176 56.2958 --> <!-- ``` -->
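
---
count: false

## Appendix: from thresholds to rating probabilities

The "What do those `Intercepts` mean?" slides describe the intercepts as cutpoints on an underlying standard normal distribution. A minimal sketch of that idea, using the four estimates printed in `summary(anx_fit1)`; the names `thresholds`, `cum_prob` and `rating_prob` are illustrative and not part of the original analysis.

``` r
# Threshold estimates copied from the intercept-only probit model's summary
thresholds <- c("1|2" = -0.8420, "2|3" = -0.1678, "3|4" = 0.3833, "4|5" = 1.0339)

# Area of the standard normal distribution to the left of each threshold,
# i.e. the cumulative probability of responding at or below that rating
cum_prob <- pnorm(thresholds)

# Area between neighbouring thresholds = probability of each single rating
rating_prob <- diff(c(0, unname(cum_prob), 1))
names(rating_prob) <- 1:5

round(rating_prob, 3)
```

For an intercept-only model, these areas should closely match the observed proportion of each rating in the data.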
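
---
count: false

## Appendix: from *t* values to *p*-values

The "Are the effects of `gender` significant?" slide says it is common practice to compare the *t* values from `polr()` to a standard normal distribution. A minimal sketch of that comparison, using the estimates and standard errors printed on that slide; the data frame `coefs` and its column names are illustrative.

``` r
# Estimates and standard errors copied from the model summary on the slide
coefs <- data.frame(
  term      = c("genderMale/Man", "genderAnother Gender"),
  estimate  = c(-0.3280, 0.4846),
  std_error = c(0.03015, 0.11992)
)

# The reported "t value" is the estimate divided by its standard error
coefs$z <- coefs$estimate / coefs$std_error

# Two-sided p-value: area in both tails of the standard normal beyond |z|
coefs$p <- 2 * pnorm(abs(coefs$z), lower.tail = FALSE)

coefs
```

With a fitted model object, the same *t* values can be read from `coef(summary(fit))[, "t value"]` rather than typed in by hand.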
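
---
count: false

## Appendix: how often is *p* significant when no effect exists?

The "Why don't significant *p*-values mean an effect exists?" slides show one random sample in which two identical groups differ significantly. The sketch below repeats that experiment many times. The population probabilities in `true_probs`, the seed, and the number of simulations are assumptions made for illustration, not the values behind the slide's figures; only *n* = 50 per group comes from the slide.

``` r
library(MASS)  # for polr()

set.seed(1)  # arbitrary seed, for reproducibility

# Hypothetical population probabilities of ratings 1-5, shared by BOTH groups,
# so there is no true group effect
true_probs  <- c(0.30, 0.25, 0.20, 0.15, 0.10)
n_per_group <- 50   # as on the slide
n_sims      <- 500  # number of simulated experiments (assumption)

simulate_p <- function() {
  simdat <- data.frame(
    group  = factor(rep(c("Group A", "Group B"), each = n_per_group)),
    # factor() keeps only the ratings that actually occur in this sample
    rating = factor(sample(1:5, size = 2 * n_per_group,
                           replace = TRUE, prob = true_probs))
  )
  fit <- polr(rating ~ group, data = simdat, method = "probit", Hess = TRUE)
  z <- coef(summary(fit))["groupGroup B", "t value"]
  2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value
}

p_values <- replicate(n_sims, simulate_p())

# Proportion of simulated experiments with a "significant" group difference
mean(p_values < 0.05)
```

Because the groups are identical by construction, every significant result here is a false positive. The proportion should land near the 0.05 threshold, not at zero.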